Abstract

In order to characterise new bacteriocins produced by Streptococcus mutans, we
performed a complete bioinformatic analysis by scanning the genome sequences of
strains UA159 and NN2025. By searching the genomic context adjacent to the
two-component signal transduction systems, we predicted the existence of several
putative new bacteriocin maturation pathways, some of which were exclusive to a
group of Streptococcus. Computational genomic and proteomic analysis combined with
predictive functional analysis represents an alternative way to rapidly identify
new putative bacteriocins, as well as new potential antimicrobial drugs, compared
with the more traditional methods of drug discovery using antagonism tests.



Findings 

The increasing resistance of bacteria to antibiotics motivates the search for new
antimicrobial compounds [1]. In this respect, bacteriocins, which are small,
ribosomally synthesised antibacterial peptides produced by bacteria, represent
promising candidates [2,3]. Bacteriocins act on sensitive cells by forming pores in
their membrane. To date, the bacteriocins produced by Gram-positive bacteria are
grouped into two major classes [4], although four classes have also been proposed [5].
Class I (lantibiotic) and class II (non-lantibiotic) bacteriocins display great
diversity with regard to their structures, modes of action, and genetic determinants
[4,6]. Typical bacteriocin biosynthesis operons are usually organised as a cluster of
genes comprising the prepropeptide-coding gene associated with genes for export and
maturation (an ATP-binding cassette (ABC) transporter, sometimes combined with a
specific protease), genes conferring immunity to the inhibitory activity to prevent
self-killing, and occasionally genes involved in regulating production of the
bacteriocin [3,6]. Expression of the bacteriocin gene cluster is under the control of
a two-component signal transduction system (TCS) composed of a histidine kinase (HK)
and its associated response regulator (RR), which are usually part of the cluster.
The inducer can be either the bacteriocin itself or a bacteriocin-like peptide [7].

Discovery of new bacteriocins traditionally rests upon functional assays based on the
inhibition of specific target bacteria. Such a method is limited and time-consuming,
because it depends on the culture conditions for bacteriocin production and on the
indicator strains used. The growing body of genomic data makes the detection of new
bacteriocin peptides possible through an in silico screening strategy and precise
computational analyses.




© 2011 Nicolas; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons
Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in
any medium, provided the original work is properly cited.



Nicolas BioData Mining 2011, 4:22
http://www.biodatamining.org/content/4/1/22






Recently, many research teams have brought to light new types of bacteriocins
using this strategy [8-11]. Furthermore, a very powerful tool for the direct discovery
of bacteriocins in genomic data has recently been developed [12]. However, such a
tool, built on the characteristics of well-known bacteriocins, may overlook new types
of bacteriocins: the bacteriocins detected by Haft's methodology are not found using
BAGEL2 [8,9]. The detection and identification of open reading frames coding for short
peptides, including bacteriocin precursors, within genomes is generally recognised as
difficult [13].
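To illustrate why short open reading frames are awkward to call, the following minimal sketch scans both strands of a nucleotide sequence for ATG-initiated ORFs within a chosen size range. It is not the procedure used in this study; the function names and the default thresholds (20 to 100 codons) are assumptions made for illustration only. Bacterial genes may also start with GTG or TTG, which this sketch ignores, and lowering the minimum length quickly increases the number of spurious candidates.

```python
from typing import Iterator, List, NamedTuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


class Orf(NamedTuple):
    strand: str   # "+" for the given strand, "-" for its reverse complement
    start: int    # 0-based start on the scanned strand
    end: int      # 0-based end (exclusive), stop codon included
    codons: int   # number of sense codons, stop codon excluded


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def scan_strand(seq: str, strand: str, min_codons: int, max_codons: int) -> Iterator[Orf]:
    """Yield ORFs (first ATG to first in-frame stop) in the three frames of one strand."""
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if min_codons <= n_codons <= max_codons:
                    yield Orf(strand, start, i + 3, n_codons)
                start = None


def find_small_orfs(seq: str, min_codons: int = 20, max_codons: int = 100) -> List[Orf]:
    """Hypothetical helper: collect short ORFs on both strands (thresholds are assumptions)."""
    seq = seq.upper()
    orfs = list(scan_strand(seq, "+", min_codons, max_codons))
    orfs += scan_strand(reverse_complement(seq), "-", min_codons, max_codons)
    return orfs


# Toy sequence (invented): one 4-codon ORF starting at offset 2 on the "+" strand.
toy = "CC" + "ATG" + "GCT" * 3 + "TAA" + "G"
small = find_small_orfs(toy, min_codons=2, max_codons=10)
```

Coordinates reported for the "-" strand refer to the reverse-complemented sequence, so they would need to be converted before being compared with genome annotations.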

Our research group is interested in the discovery of new antibacterial compounds
produced by Streptococcus mutans, named mutacins [14]. Based on the conserved
organisation of bacteriocin biosynthesis operons, we screened the genomic context of
the HK/RR genes found in the S. mutans UA159 genome (GenBank: AE014133) [15] to
detect new putative bacteriocin-encoding genes. Following a thorough inspection by
bioinformatic analysis using available web tools, we were able to identify new
putative bacteriocin maturation pathways in the S. mutans genome.

The Microbial Signal Transduction database (MiST, http://mistdb.com) [16] was
used to locate the HK/RR genes in the S. mutans genome (Table 1). A set of small
ORFs encoding small peptides was identified around each TCS. By browsing the genomic
context using the Entrez Gene tool from the NCBI (http://www.ncbi.nlm.nih.gov/gene),
we identified a complete set of genes able to produce bacteriocins in the vicinity of
the SMU.1548c/1547c locus tags (Figure 1a).
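The neighbourhood screen described above was carried out by browsing the annotation manually. As a rough sketch of the same idea, the code below lists small annotated ORFs lying within a fixed distance of each TCS gene. The window size (10 kb), the size cut-off (100 amino acids), the `Gene` record and all coordinates in the toy data are assumptions for illustration; they are not values taken from this study or from the UA159 annotation.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Gene:
    locus_tag: str
    start: int       # 1-based start coordinate
    end: int         # 1-based end coordinate (inclusive)
    aa_length: int   # length of the encoded protein


def gap_bp(a: Gene, b: Gene) -> int:
    """Number of bases separating two genes (0 if they overlap or abut)."""
    return max(0, max(a.start, b.start) - min(a.end, b.end) - 1)


def small_orfs_near(anchors: List[Gene], genes: List[Gene],
                    window_bp: int = 10_000, max_aa: int = 100) -> Dict[str, List[str]]:
    """For each anchor (e.g. an HK or RR gene), list nearby genes encoding small peptides."""
    anchor_tags = {a.locus_tag for a in anchors}
    return {
        anchor.locus_tag: [
            g.locus_tag for g in genes
            if g.locus_tag not in anchor_tags
            and g.aa_length <= max_aa
            and gap_bp(anchor, g) <= window_bp
        ]
        for anchor in anchors
    }


# Toy annotation with invented names and coordinates.
annotation = [
    Gene("tcs_hk", 5000, 6300, 433),
    Gene("big_enzyme", 7000, 8500, 499),
    Gene("pep1", 9000, 9210, 69),
    Gene("far_pep", 40000, 40150, 49),
]
hits = small_orfs_near([annotation[0]], annotation)
```

In practice the gene records would come from a parsed GenBank file, and every candidate returned this way would still need the signal peptide, similarity and promoter checks described in the text.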



Table 1 Two-component systems found in the S. mutans UA159 genome.

| HK/RR locus tag (NCBI) - gene name | Identified peptides surrounding the HK/RR | Predicted protein function |
|---|---|---|
| SMU.45* | SMU.40/41 | |
| SMU.486/487 | | |
| SMU.577/576 - lytS/lytR | SMU.571, SMU.572 | dehydrogenase/cyclohydrolase |
| SMU.660/659 | | |
| SMU.928/927 | | |
| SMU.1009/1008 | | |
| SMU.1037c/1038c | SMU.1047c ? | |
| SMU.1128/1129 - ciaH/ciaR | SMU.1131c | |
| SMU.1145c/1146c | SMU.1147c | SMU.1148-1150 ABC transporter |
| SMU.1516/1517 - covS/covR (vicK/vicR) | | |
| SMU.1548c/1547c | SMU.1553c/1554c | SMU.1550c integral membrane protein, ... |
| SMU.1814/1815 - scnK/scnR | SMU.1818c | |
| SMU.1965c/1964c | | |
| SMU.1916/1917 - comD/comE | | |
| SMU.1924 - gcrR* | | |

# HK uncoupled to a RR

* RR uncoupled to an HK


HKIKTSYLIYAVIATVFAGLSMVTDIFEKGNFEAIFSNLGMIIGLSILAYILITMAKSFIDLVINELKN 
Figure 1 a) Predicted bacteriocin maturation pathway. GenBank locus tags are given with the
propeptide genes (blue), the ABC transporter genes (green), the immunity gene (magenta), the
transcription factor gene (yellow), and the response regulator gene (red). b) Sequence alignment of the
significant hits with gi|24379940|ref|NP_721895.1| hypothetical protein SMU.1554c [Streptococcus mutans
UA159] as query ID using BlastP and COBALT with default parameters (NCBI).

The cluster of genes (location: 1475357-1478860) presents the same genomic
organisation as a conventional bacteriocin biosynthesis operon, with the genes encoding
two small peptides (SMU.1554c and SMU.1553c), the ABC transporter genes
(SMU.1552c/SMU.1551c), a gene encoding an integral membrane protein possibly involved
in the immunity function (SMU.1550c), and the TCS genes, the HK gene (SMU.1548c) and
the RR gene (SMU.1547c), probably implicated in regulating biosynthesis of the
bacteriocin. Furthermore, additional atypical genes were identified: a methionine
aminopeptidase (ampM/SMU.1556c) and a putative acetyltransferase (SMU.1558c), with
functions related to proteases and scaffolding proteins.

Putative precursor peptides were analysed for the presence of a signal peptide using
the Signal-3L (http://www.csbio.sjtu.edu.cn/bioinf/Signal-3L/) [17] and PrediSi
(http://www.predisi.de/) [18] algorithms.

The potential antimicrobial activity of the putative mature peptides was evaluated
using freely available web programs such as APD2 (http://aps.unmc.edu/AP/main.php)
[19] and the AntiBP2 server (http://www.imtech.res.in/raghava/antibp/) [20]. Similarity
with known antimicrobial peptides was retrieved for the query peptide sequences.
SMU.1553c shows similarity to the carnocyclin A peptide [21].
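Servers of this kind rely on their own databases and trained models, which are not reproduced here. As a simple illustration of the sort of descriptors commonly inspected for a candidate antimicrobial peptide, the sketch below computes length, an approximate net charge and the hydrophobic fraction. The residue sets, the neglect of histidine and of the peptide termini, and the function name are assumptions; the example peptide is invented and is not one of the S. mutans candidates.

```python
HYDROPHOBIC = set("AILMFVWC")   # assumed definition of hydrophobic residues
POSITIVE = set("KR")            # histidine is ignored in this approximation
NEGATIVE = set("DE")


def peptide_summary(seq: str) -> dict:
    """Return crude physicochemical descriptors for a peptide (illustrative only)."""
    seq = seq.strip().upper()
    if not seq:
        raise ValueError("empty peptide sequence")
    net_charge = sum(r in POSITIVE for r in seq) - sum(r in NEGATIVE for r in seq)
    hydrophobic_fraction = sum(r in HYDROPHOBIC for r in seq) / len(seq)
    return {
        "length": len(seq),
        "net_charge": net_charge,
        "hydrophobic_fraction": hydrophobic_fraction,
    }


# Invented 8-residue peptide: 2 basic, 2 acidic, 4 hydrophobic residues.
summary = peptide_summary("KKLLAIDE")
```

Such descriptors only help to rank candidates; they do not replace the similarity searches against known antimicrobial peptides mentioned above.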

A BlastP analysis [22] of the precursor peptides reveals that these peptides, together
with their genomic context, are strictly conserved in species of the Streptococcus
salivarius group (Figure 1b).

The upstream genomic sequence was analysed to detect putative promoter regions
and transcription factor binding sites using the bacterial promoter recognition program
BPROM (Softberry Inc.) (Figure 2).
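The region submitted to BPROM is given in the header of the report in Figure 2 as positions 1478430-1478860 of the UA159 genome, with a stated length of 431. A minimal sketch of how such a region can be cut out of a genome string with 1-based inclusive coordinates is shown below; the helper names are hypothetical, and whether the sequence was analysed as given or as its reverse complement is not stated in the source.

```python
def region_length(start: int, end: int) -> int:
    """Length of a region given 1-based inclusive coordinates."""
    if start < 1 or end < start:
        raise ValueError("coordinates must satisfy 1 <= start <= end")
    return end - start + 1


def extract_region(genome: str, start: int, end: int) -> str:
    """Return genome[start..end] using 1-based inclusive coordinates."""
    region_length(start, end)  # validates the coordinates
    if end > len(genome):
        raise ValueError("region extends past the end of the sequence")
    return genome[start - 1:end]


# The coordinates from the Figure 2 header are consistent with the reported length.
upstream_length = region_length(1478430, 1478860)

# Toy example on an invented 8-base sequence.
fragment = extract_region("ACGTACGT", 3, 5)
```

Because the genes of the cluster carry a "c" suffix in their locus tags, indicating the complementary strand, the region upstream of the first gene lies at the high-coordinate end of the cluster (1475357-1478860).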

Many putative mutacin-encoding genes have previously been predicted using
bioinformatic analyses, and some of them were functionally verified using mutational
analyses in S. mutans UA159 [23]. Inactivation of all putative mutacin genes did not
completely abrogate the antibacterial activity of the strain, suggesting the existence of an
>gi|24378532:1478430-1478860 Streptococcus mutans UA159, complete genome
Length of sequence - 431
Threshold for promoters - 0.20
Number of predicted promoters - 2
Promoter Pos: 78 LDF - 6.88
  -10 box at pos. 63 TTTTAAAAT Score 79
  -35 box at pos. 46 TTTCCT Score 37
Promoter Pos: 410 LDF - 2.79
  -10 box at pos. 395 TGACAACAT Score 31
  -35 box at pos. 377 CTGCAA Score 20

Oligonucleotides from known TF binding sites:

For promoter at 78:
  fhlA: TCATTTTC at position 38 Score - 7
  lrp: ATTTTTTT at position 59 Score - 11
  lexA: TTTTTTTA at position 60 Score - 16
  rpoD18: TTTAAAAT at position 64 Score - 7
  rpoD17: GTGTCATA at position 93 Score - 7
For promoter at 410:
  carP: CTGTAAAA at position 359 Score - 7
  rpoD17: AAAAATAG at position 381 Score - 9
  nagC: TTTAATTT at position 420 Score - 7
  rpoD18: TTAATTTT at position 421 Score - 9

Figure 2 Report of the BPROM promoter detection software









other type of inhibitory substance, which supports the reliability of our
hypothesis and findings [23].
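When many upstream regions are screened, reports such as the one in Figure 2 are easier to compare once converted into structured records. The sketch below parses the promoter lines of that report. The regular expressions are written for the layout as printed in Figure 2 and are an assumption; they may need adjusting for the raw output of the BPROM server. The values in the embedded sample are those of Figure 2.

```python
import re

PROMOTER_RE = re.compile(r"Promoter\s+Pos:\s*(\d+)\s+LDF\s*-\s*([\d.]+)")
BOX_RE = re.compile(r"(-10|-35) box at pos\.\s*(\d+)\s+([ACGT]+)\s+Score\s+(\d+)")


def parse_bprom(report: str) -> list:
    """Extract promoter positions, LDF scores and -10/-35 boxes from a BPROM-style report."""
    promoters = []
    for line in report.splitlines():
        match = PROMOTER_RE.search(line)
        if match:
            promoters.append({"pos": int(match.group(1)),
                              "ldf": float(match.group(2)),
                              "boxes": {}})
            continue
        match = BOX_RE.search(line)
        if match and promoters:
            # Attach the box to the most recently seen promoter.
            promoters[-1]["boxes"][match.group(1)] = {
                "pos": int(match.group(2)),
                "seq": match.group(3),
                "score": int(match.group(4)),
            }
    return promoters


REPORT = """\
Promoter Pos: 78 LDF - 6.88
-10 box at pos. 63 TTTTAAAAT Score 79
-35 box at pos. 46 TTTCCT Score 37
Promoter Pos: 410 LDF - 2.79
-10 box at pos. 395 TGACAACAT Score 31
-35 box at pos. 377 CTGCAA Score 20
"""

promoters = parse_bprom(REPORT)
```

Each record can then be filtered, for example by keeping only promoters above a chosen LDF value; any such cut-off would be a user choice rather than a recommendation from this study.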

The group of genes detected by our method predicts the existence of a putative
bacteriocin maturation pathway in an exclusive group of Streptococcus and reveals its
potential to encode a new type of bacteriocin. It also provides a mature hypothesis
that may be tested by focused wet-lab experiments. Since inactivation of small genes
remains difficult to perform, our study provides computational evidence for the
production of a new putative bacteriocin. This method can be applied to the large set
of short coding sequences of as yet unknown function found in streptococcal genomes
[13].



Abbreviations 

ABC: ATP-binding cassette; BlastP: Basic Local Alignment Search Tool for proteins; ORF: open reading frame;
TCS: two-component signal transduction system; HK: histidine kinase; RR: response regulator.